From Uncertain Inference to Probability of Relevance for Advanced IR Applications

Authors

  • Henrik Nottelmann
  • Norbert Fuhr
Abstract

Uncertain inference is a probabilistic generalisation of the logical view on databases, ranking documents according to the probability that they logically imply the query. For tasks other than ad-hoc retrieval, estimates of the actual probability of relevance are required. In this paper, we investigate mapping functions between these two types of probability. For this purpose, we consider linear and logistic functions. The former have been proposed before, whereas we give a new theoretical justification for the latter. In a series of upper-bound experiments, we compare the goodness of fit of the two models. A second series of experiments investigates the effect on the resulting retrieval quality in the fusion step of distributed retrieval. These experiments show that good estimates of the actual probability of relevance can be achieved, and that the logistic model outperforms the linear one. However, retrieval quality for distributed retrieval (merging only, without resource selection) is only slightly improved by using the logistic function.
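The two kinds of mapping function named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact parameterisation, so the forms below are assumptions: a linear function `c0 + c1 * p` (clipped to [0, 1], which is also an assumption) and a standard logistic function of `b0 + b1 * p`. The helper `merge_results`, the parameter values, and the toy result lists are hypothetical and only show how mapped probabilities might be used to fuse rankings from several collections.

```python
import math

def linear_map(p_inf, c0, c1):
    """Assumed linear mapping from probability of inference to probability
    of relevance; clipping to [0, 1] is an assumption of this sketch."""
    return min(1.0, max(0.0, c0 + c1 * p_inf))

def logistic_map(p_inf, b0, b1):
    """Assumed logistic mapping: sigmoid of a linear function of p_inf."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * p_inf)))

def merge_results(result_lists, mappers):
    """Hypothetical fusion step: map each collection's scores to estimated
    probabilities of relevance, then sort all documents by that estimate."""
    merged = []
    for name, results in result_lists.items():
        mapper = mappers[name]
        merged.extend((doc_id, mapper(p)) for doc_id, p in results)
    return sorted(merged, key=lambda pair: pair[1], reverse=True)

# Toy data (made up): per-collection lists of (document id, P(d -> q)).
results = {
    "A": [("a1", 0.8), ("a2", 0.3)],
    "B": [("b1", 0.5)],
}
# Made-up per-collection parameters; in practice these would be fitted
# to relevance judgements.
mappers = {
    "A": lambda p: logistic_map(p, -2.0, 4.0),
    "B": lambda p: linear_map(p, 0.0, 1.0),
}
ranking = merge_results(results, mappers)
```

Because each collection gets its own mapping, scores that are not directly comparable across collections are placed on a common scale before merging.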


Similar resources

Probabilistic datalog: Implementing logical information retrieval for advanced applications

In the logical approach to information retrieval (IR), retrieval is considered as uncertain inference. Whereas classical IR models are based on propositional logic, we combine Datalog (function-free Horn clause predicate logic) with probability theory. Therefore, probabilistic weights may be attached to both facts and rules. The underlying semantics extends the well-founded semantics of modular...


From Uncertain Inference to Agent-Based Information Retrieval

The logical approach to information retrieval (IR) treats retrieval as uncertain inference. In advanced applications, we have to deal with a multi-step inference process involving different system components: information needs, search activities, queries, query intermediaries, databases and documents. In order to provide a better system support for satisfying a user's information need, these co...


Non-zero probability of nearest neighbor searching

Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition and computational geometry. The goal of NN searching is efficiently reporting the nearest data to a given object as a query. In most of the studies both the data and query are assumed to be precise, however, due to the real applications of NN searching, suc...


Language Models and Uncertain Inference in Information Retrieval

In the logical view on IR systems, retrieval is interpreted as implication [Rijsbergen 86]: Let d denote a document (represented as logical formula) and q a query, then retrieval deals with the task of finding those documents which imply the query, i.e. for which the formula d → q is true. Due to the intrinsic uncertainty and vagueness of IR, we have to switch to uncertain inference. Using a pr...


Information Retrieval with Probabilistic Datalog

The probabilistic logical approach in Information Retrieval (IR) aims at describing the retrieval process as the computation of the probability P(d → q) that a document d implies a query q. Probabilistic Datalog (DatalogP) is a logic that enables uncertain inference. We use DatalogP as a platform for investigating the probabilistic logical approach in IR. The expressiveness of DatalogP allows ...




Publication date: 2003